

Poster: stilett0 Date: Aug 3, 2009 7:55pm
Forum: faqs Subject: new site owner robots.txt supersedes previous owner's?

Starroms.com used to be a website where you could purchase arcade ROMs licensed from Atari for personal use.

However, they shut down years ago and closed the website. I was interested in looking at the archived copy.

Now a new owner has purchased the domain, and they have implemented robots.txt protection.

Somehow, I believe the current owner's robots.txt supersedes the previous owner's and blocks the previously archived site. When I followed this link:
http://web.archive.org/web/20080209130137/www.starroms.com/robots.txt earlier, it said something about searchinformation.com, which is a common advertising portal...
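
For anyone reading along: the long number in a Wayback Machine link is the capture time (year, month, day, hour, minute, second), followed by the original address. Here is a minimal Python sketch that splits the link above into those two parts. The function name `parse_wayback_url` is made up for illustration; it is not part of any Archive.org tool.

```python
import re
from datetime import datetime

# Pattern for links of the form
# http://web.archive.org/web/<14-digit timestamp>/<original address>
WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def parse_wayback_url(url):
    """Return (capture_time, original_address) for a Wayback Machine link.

    Hypothetical helper for illustration only.
    """
    match = WAYBACK_RE.match(url)
    if match is None:
        raise ValueError("not a timestamped Wayback Machine link: " + url)
    # The timestamp is YYYYMMDDhhmmss.
    capture_time = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return capture_time, match.group(2)


capture_time, original = parse_wayback_url(
    "http://web.archive.org/web/20080209130137/www.starroms.com/robots.txt"
)
print(capture_time.isoformat(), original)
```

So the link points at a capture of www.starroms.com/robots.txt made on Feb 9, 2008.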

I'm not sure if this is a bug, a flaw inherent in the design, or what. Maybe the original owner had a thorough robots.txt. Maybe I am misinterpreting things.


Poster: garthus Date: Aug 4, 2009 7:00am
Forum: faqs Subject: Re: new site owner robots.txt supersedes previous owner's?

This concerns me. Does this mean that if someone dies and their domain is taken over, all of their site info will be deleted or changed?



Poster: stilett0 Date: Aug 4, 2009 12:22pm
Forum: faqs Subject: Re: new site owner robots.txt supersedes previous owner's?

So I've been doing some research by searching the forums, and it has been discussed before.


I think, to take care of things properly, the Internet Archive would have to keep track of every domain name change of ownership for all sites in its archive, which is frankly impossible. There are a few websites that attempt to cache domain name histories to preserve information about previous owners, such as http://domain-history.domaintools.com/, http://openaccess.dialog.com/ip/ (now dead), and http://www.who.is/domain_archive-com

What I would do is take that domain name ownership cache and use it to decide when a website's history is accessible and when it isn't.
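
To make that idea concrete, here is a rough Python sketch of how it could work. Everything in it is an assumption for illustration: the `OwnershipPeriod` record, the `snapshot_visible` function, and the ownership dates are all made up, and this is not how the Wayback Machine actually behaves. The point is only that each owner's robots.txt would apply to captures from that owner's period, not to earlier ones.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class OwnershipPeriod:
    """One entry from a (hypothetical) domain ownership history."""
    owner: str
    start: date
    end: Optional[date]   # None means "still the current owner"
    blocks_archive: bool  # True if this owner's robots.txt excludes the archive


def snapshot_visible(periods: List[OwnershipPeriod], captured: date) -> bool:
    """Decide visibility using the robots.txt of whoever owned the domain
    on the capture date, rather than the current owner's robots.txt."""
    for period in periods:
        started = period.start <= captured
        not_ended = period.end is None or captured < period.end
        if started and not_ended:
            return not period.blocks_archive
    # No ownership record covers this date. Hiding the capture is one
    # possible (conservative) policy; showing it would be another.
    return False


# Made-up ownership history; the real dates for starroms.com are not known here.
history = [
    OwnershipPeriod("original owner", date(2003, 1, 1), date(2009, 1, 1), False),
    OwnershipPeriod("new owner", date(2009, 1, 1), None, True),
]

print(snapshot_visible(history, date(2008, 2, 9)))  # capture from the old owner's period
print(snapshot_visible(history, date(2009, 8, 3)))  # capture from the new owner's period
```

Under this scheme the 2008 capture would stay viewable, and only captures made after the domain changed hands would be hidden by the new owner's robots.txt.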

I really need to find all the forum threads about domain camping and robots.txt to see what the current thinking is from Archive.org management.

I can tell you this: the current status is SO not quo.
